@danielthegray

In your post, you say: "Adding that !len is actually somewhat costly,
though I couldn’t figure out why."

I suspect this is because the "!" operator essentially behaves like a
branch, returning 1 if the input is 0 and 0 otherwise.

So my idea was to keep your table of lengths and add a second table of
"error lengths" alongside it: 0 where the length is OK and 1 where there
is an error. Adding the error length in place of !len has the same
effect, ensuring that the decoder moves forward at least one byte, as
mentioned.
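
A minimal sketch of the two variants, assuming the 32-entry length table from the post, indexed by the top five bits of the first byte. The names `err_lengths`, `step_with_not`, and `step_with_table` are hypothetical and only illustrate the idea; they are not necessarily what the patch uses.

```c
#include <stddef.h>

/* Sequence length keyed by the top five bits of the lead byte.
 * 0 marks a byte that cannot start a sequence (assumed layout). */
static const unsigned char lengths[32] = {
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 3, 3, 4, 0
};

/* Hypothetical companion table: 1 exactly where lengths[] is 0,
 * and 0 everywhere else. */
static const unsigned char err_lengths[32] = {
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1
};

/* Original approach: bytes to advance, computed with the "!" operator. */
static size_t step_with_not(unsigned char lead)
{
    size_t len = lengths[lead >> 3];
    return len + !len;
}

/* Table approach: the second lookup replaces the "!" operator. */
static size_t step_with_table(unsigned char lead)
{
    unsigned idx = lead >> 3;
    return (size_t)lengths[idx] + err_lengths[idx];
}
```

Both functions return the same step for every possible lead byte, so any difference in throughput comes from the code the compiler generates and from the extra table lookup, not from the result.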

The throughput went up from 504 MB/s to 557 MB/s on my machine.

@N-R-K

N-R-K commented Jun 23, 2022

For what it's worth, I actually see the speed drop from ~647 MB/s to ~611 MB/s with this patch applied on my system (3700x).
